智能论文笔记

Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy

Dibakar Das , Vikram Seshasai , Vineet Sudhir Bhat , Pushkal Juneja , Jyotsna Bapat , Debabrata Das

分类：人工智能 | 机器学习

2022-09-06

本文提出了一种新型的非侵入系统故障预测技术，使用来自开发人员的可用信息，以及来自原始日志中的最小信息（而不是挖掘整个日志），但与数据所有者完全保持数据。基于神经网络的多级分类器是为故障预测而开发的，使用人为生成的匿名数据集，应用技术组合，即遗传算法（步骤），模式重复等，以训练和测试网络。提出的机制完全将用于培训过程的数据集与保留私有数据的数据集分解。此外，多标准决策（MCDM）方案用于优先考虑满足业务需求的失败。结果显示在不同参数配置下的故障预测准确性。在更广泛的上下文上，可以使用提出的机制具有人工生成的数据集执行任何分类问题，而无需查看实际数据，只要输入功能可以转换为二进制值（例如，来自私有二进制分类器的输出）并可以提供分类 - 服务。

translated by 谷歌翻译

Semi-Structured Object Sequence Encoders

Rudra Murthy V , Riyaz Bhat , Chulaka Gunasekara , Hui Wan , Tejas Indulal Dhamecha , Danish Contractor , Marina Danilevsky

分类：计算机视觉 | 人工智能 | 自然语言处理

2023-01-03

In this paper we explore the task of modeling (semi) structured object sequences; in particular we focus our attention on the problem of developing a structure-aware input representation for such sequences. In such sequences, we assume that each structured object is represented by a set of key-value pairs which encode the attributes of the structured object. Given a universe of keys, a sequence of structured objects can then be viewed as an evolution of the values for each key, over time. We encode and construct a sequential representation using the values for a particular key (Temporal Value Modeling - TVM) and then self-attend over the set of key-conditioned value sequences to a create a representation of the structured object sequence (Key Aggregation - KA). We pre-train and fine-tune the two components independently and present an innovative training schedule that interleaves the training of both modules with shared attention heads. We find that this iterative two part-training results in better performance than a unified network with hierarchical encoding as well as over, other methods that use a {\em record-view} representation of the sequence \cite{de2021transformers4rec} or a simple {\em flattened} representation of the sequence. We conduct experiments using real-world data to demonstrate the advantage of interleaving TVM-KA on multiple tasks and detailed ablation studies motivating our modeling choices. We find that our approach performs better than flattening sequence objects and also allows us to operate on significantly larger sequences than existing methods.

translated by 谷歌翻译

Benchmarking Spatial Relationships in Text-to-Image Generation

Tejas Gokhale , Hamid Palangi , Besmira Nushi , Vibhav Vineet , Eric Horvitz , Ece Kamar , Chitta Baral , Yezhou Yang

分类：计算机视觉 | 人工智能 | 自然语言处理

2022-12-20

Spatial understanding is a fundamental aspect of computer vision and integral for human-level reasoning about images, making it an important component for grounded language understanding. While recent large-scale text-to-image synthesis (T2I) models have shown unprecedented improvements in photorealism, it is unclear whether they have reliable spatial understanding capabilities. We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image. To benchmark existing models, we introduce a large-scale challenge dataset SR2D that contains sentences describing two objects and the spatial relationship between them. We construct and harness an automated evaluation pipeline that employs computer vision to recognize objects and their spatial relationships, and we employ it in a large-scale evaluation of T2I models. Our experiments reveal a surprising finding that, although recent state-of-the-art T2I models exhibit high image quality, they are severely limited in their ability to generate multiple objects or the specified spatial relations such as left/right/above/below. Our analyses demonstrate several biases and artifacts of T2I models such as the difficulty with generating multiple objects, a bias towards generating the first object mentioned, spatially inconsistent outputs for equivalent relationships, and a correlation between object co-occurrence and spatial understanding capabilities. We conduct a human study that shows the alignment between VISOR and human judgment about spatial understanding. We offer the SR2D dataset and the VISOR metric to the community in support of T2I spatial reasoning research.

translated by 谷歌翻译

Unsupervised Dense Retrieval Deserves Better Positive Pairs: Scalable Augmentation with Query Extraction and Generation

Rui Meng , Ye Liu , Semih Yavuz , Divyansh Agarwal , Lifu Tu , Ning Yu , Jianguo Zhang , Meghana Bhat , Yingbo Zhou

分类：自然语言处理

2022-12-17

Dense retrievers have made significant strides in obtaining state-of-the-art results on text retrieval and open-domain question answering (ODQA). Yet most of these achievements were made possible with the help of large annotated datasets, unsupervised learning for dense retrieval models remains an open problem. In this work, we explore two categories of methods for creating pseudo query-document pairs, named query extraction (QExt) and transferred query generation (TQGen), to augment the retriever training in an annotation-free and scalable manner. Specifically, QExt extracts pseudo queries by document structures or selecting salient random spans, and TQGen utilizes generation models trained for other NLP tasks (e.g., summarization) to produce pseudo queries. Extensive experiments show that dense retrievers trained with individual augmentation methods can perform comparably well with multiple strong baselines, and combining them leads to further improvements, achieving state-of-the-art performance of unsupervised dense retrieval on both BEIR and ODQA datasets.

translated by 谷歌翻译

EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation

Yunhao Ge , Jiashu Xu , Brian Nlong Zhao , Laurent Itti , Vibhav Vineet

分类：计算机视觉

2022-12-15

We propose EM-PASTE: an Expectation Maximization(EM) guided Cut-Paste compositional dataset augmentation approach for weakly-supervised instance segmentation using only image-level supervision. The proposed method consists of three main components. The first component generates high-quality foreground object masks. To this end, an EM-like approach is proposed that iteratively refines an initial set of object mask proposals generated by a generic region proposal method. Next, in the second component, high-quality context-aware background images are generated using a text-to-image compositional synthesis method like DALL-E. Finally, the third component creates a large-scale pseudo-labeled instance segmentation training dataset by compositing the foreground object masks onto the original and generated background images. The proposed approach achieves state-of-the-art weakly-supervised instance segmentation results on both the PASCAL VOC 2012 and MS COCO datasets by using only image-level, weak label information. In particular, it outperforms the best baseline by +7.4 and +2.8 mAP0.50 on PASCAL and COCO, respectively. Further, the method provides a new solution to the long-tail weakly-supervised instance segmentation problem (when many classes may only have few training samples), by selectively augmenting under-represented classes.

translated by 谷歌翻译

Neural Continuous-Time Markov Models

Majerle Reeves , Harish S. Bhat

分类： (统计)机器学习 | 机器学习

2022-12-11

Continuous-time Markov chains are used to model stochastic systems where transitions can occur at irregular times, e.g., birth-death processes, chemical reaction networks, population dynamics, and gene regulatory networks. We develop a method to learn a continuous-time Markov chain's transition rate functions from fully observed time series. In contrast with existing methods, our method allows for transition rates to depend nonlinearly on both state variables and external covariates. The Gillespie algorithm is used to generate trajectories of stochastic systems where propensity functions (reaction rates) are known. Our method can be viewed as the inverse: given trajectories of a stochastic reaction network, we generate estimates of the propensity functions. While previous methods used linear or log-linear methods to link transition rates to covariates, we use neural networks, increasing the capacity and potential accuracy of learned models. In the chemical context, this enables the method to learn propensity functions from non-mass-action kinetics. We test our method with synthetic data generated from a variety of systems with known transition rates. We show that our method learns these transition rates with considerably more accuracy than log-linear methods, in terms of mean absolute error between ground truth and predicted transition rates. We also demonstrate an application of our methods to open-loop control of a continuous-time Markov chain.

translated by 谷歌翻译

Drift Identification for Lévy alpha-Stable Stochastic Systems

Harish S. Bhat

分类： (统计)机器学习 | 机器学习

2022-12-06

This paper focuses on a stochastic system identification problem: given time series observations of a stochastic differential equation (SDE) driven by L\'{e}vy $\alpha$-stable noise, estimate the SDE's drift field. For $\alpha$ in the interval $[1,2)$, the noise is heavy-tailed, leading to computational difficulties for methods that compute transition densities and/or likelihoods in physical space. We propose a Fourier space approach that centers on computing time-dependent characteristic functions, i.e., Fourier transforms of time-dependent densities. Parameterizing the unknown drift field using Fourier series, we formulate a loss consisting of the squared error between predicted and empirical characteristic functions. We minimize this loss with gradients computed via the adjoint method. For a variety of one- and two-dimensional problems, we demonstrate that this method is capable of learning drift fields in qualitative and/or quantitative agreement with ground truth fields.

translated by 谷歌翻译

Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations

Ashwani Bhat , Ashutosh Modi

分类：自然语言处理 | 人工智能 | 机器学习

2022-11-07

Predicting emotions expressed in text is a well-studied problem in the NLP community. Recently there has been active research in extracting the cause of an emotion expressed in text. Most of the previous work has done causal emotion entailment in documents. In this work, we propose neural models to extract emotion cause span and entailment in conversations. For learning such models, we use RECCON dataset, which is annotated with cause spans at the utterance level. In particular, we propose MuTEC, an end-to-end Multi-Task learning framework for extracting emotions, emotion cause, and entailment in conversations. This is in contrast to existing baseline models that use ground truth emotions to extract the cause. MuTEC performs better than the baselines for most of the data folds provided in the dataset.

translated by 谷歌翻译

Causal Structural Hypothesis Testing and Data Generation Models

Jeffrey Jiang , Omead Pooladzandi , Sunay Bhat , Gregory Pottie

分类：机器学习 | 人工智能 | (统计)机器学习

2022-10-20

A vast amount of expert and domain knowledge is captured by causal structural priors, yet there has been little research on testing such priors for generalization and data synthesis purposes. We propose a novel model architecture, Causal Structural Hypothesis Testing, that can use nonparametric, structural causal knowledge and approximate a causal model's functional relationships using deep neural networks. We use these architectures for comparing structural priors, akin to hypothesis testing, using a deliberate (non-random) split of training and testing data. Extensive simulations demonstrate the effectiveness of out-of-distribution generalization error as a proxy for causal structural prior hypothesis testing and offers a statistical baseline for interpreting results. We show that the variational version of the architecture, Causal Structural Variational Hypothesis Testing can improve performance in low SNR regimes. Due to the simplicity and low parameter count of the models, practitioners can test and compare structural prior hypotheses on small dataset and use the priors with the best generalization capacity to synthesize much larger, causally-informed datasets. Finally, we validate our methods on a synthetic pendulum dataset, and show a use-case on a real-world trauma surgery ground-level falls dataset.

translated by 谷歌翻译

Ground then Navigate: Language-guided Navigation in Dynamic Scenes

Kanishk Jain , Varun Chhangani , Amogh Tiwari , K. Madhava Krishna , Vineet Gandhi

分类：计算机视觉

2022-09-24

我们在室外环境中自动驾驶的背景下研究了视觉和语言导航（VLN）问题。我们通过明确接地与Textual命令相对应的可通道区域来解决问题。在每个时间戳，该模型预测与中间或最终可通道区域相对应的分割掩码。我们的工作与VLN中的现有工作形成鲜明对比，VLN的现有工作将该任务置于节点选择问题，并且给定与环境相对应的离散连接图。我们不假定这种离散的地图的可用性。我们的工作朝着动作领域的连续性发展，通过视觉反馈提供了解释性，并允许在需要更精细的操作的命令上进行VLN，例如“两辆汽车之间的停车”。此外，我们提出了一种新型的元数据carla-nav，以允许有效的训练和验证。该数据集包括预录制的培训序列以及用于验证和测试的实时环境。我们提供广泛的定性和定量经验结果，以验证所提出的方法的功效。

translated by 谷歌翻译